Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System

Authors

Elliot A. Ludvig

Richard S. Sutton

E. James Kehoe

Abstract

The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances correspondence between model and data in several experiments, including those when rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
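As a rough illustration of the mechanism the abstract describes, the sketch below generates microstimuli that grow weaker and more diffuse over time and feeds them to a linear TD(lambda) learner. The abstract gives no equations, so everything concrete here is an assumption made for illustration: the exponentially decaying memory trace, the Gaussian basis functions, all parameter values (10 microstimuli per stimulus, decay 0.985, width 0.08, step size 0.01, discount 0.98, lambda 0.95), the trial timing, and the function names `microstimuli`, `trial_features`, and `run_trial`.

```python
import math


def microstimuli(elapsed, num=10, decay=0.985, sigma=0.08):
    """Levels of `num` microstimuli `elapsed` steps after a stimulus onset."""
    if elapsed is None or elapsed < 0:
        return [0.0] * num  # the stimulus has not occurred (yet)
    trace = decay ** elapsed  # assumed memory trace: 1.0 at onset, then fading
    levels = []
    for i in range(num):
        center = (i + 1) / num  # assumed evenly spaced basis centers
        basis = math.exp(-((trace - center) ** 2) / (2 * sigma ** 2))
        levels.append(trace * basis)  # scaling by the trace weakens late ones
    return levels


def trial_features(length, cue_time, reward_time, num=10):
    """Feature vectors for steps 0..length-1, plus an all-zero terminal step."""
    feats = []
    for t in range(length):
        since_reward = None if reward_time is None else t - reward_time
        # Both the cue and the reward spawn their own set of microstimuli.
        feats.append(microstimuli(t - cue_time, num)
                     + microstimuli(since_reward, num))
    feats.append([0.0] * (2 * num))
    return feats


def run_trial(w, cue_time=5, reward_time=25, length=60,
              alpha=0.01, gamma=0.98, lam=0.95, learn=True):
    """Run one trial of linear TD(lambda); return the TD error at each step."""
    feats = trial_features(length, cue_time, reward_time, len(w) // 2)
    elig = [0.0] * len(w)  # eligibility traces, reset at the start of a trial
    errors = [0.0]         # no error is defined at step 0
    for t in range(1, length + 1):
        prev, cur = feats[t - 1], feats[t]
        reward = 1.0 if t == reward_time else 0.0
        v_prev = sum(wi * xi for wi, xi in zip(w, prev))
        v_cur = sum(wi * xi for wi, xi in zip(w, cur))
        delta = reward + gamma * v_cur - v_prev  # reward-prediction error
        errors.append(delta)
        if learn:
            for i in range(len(w)):
                elig[i] = gamma * lam * elig[i] + prev[i]
                w[i] += alpha * delta * elig[i]
    return errors


if __name__ == "__main__":
    weights = [0.0] * 20
    first = run_trial(weights)
    for _ in range(500):
        run_trial(weights)
    trained = run_trial(weights, learn=False)
    print("error at reward, first trial:", first[25])
    print("error at reward, after training:", trained[25])
```

Calling `run_trial(weights, reward_time=None, learn=False)` probes a reward-omission trial, and changing `reward_time` on a probe trial mimics an early reward, the two cases the abstract singles out. Because neighboring time steps share microstimuli, predictions learned at one moment generalize to nearby moments, which is the temporal generalization the abstract refers to; a complete serial compound would instead use one separate feature per time step.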

Related articles

The Use of Fundamental Color Stimulus to Improve the Performance of Artificial Neural Network Color Match Prediction Systems

In the present investigation attempts were made for the first time to use the fundamental color stimulus as the input for a fixed optimized neural network match prediction system. Four sets of data having different origins (i.e. different substrate, different colorant sets and different dyeing procedures) were used to train and test the performance of the network. The results showed that th...

Time, Not Size, Matters for Striatal Reward Predictions to Dopamine

Midbrain dopamine neurons encode reward prediction errors. In this issue of Neuron, Takahashi et al. (2016) show that the ventral striatum provides dopamine neurons with prediction information specific to the timing, but not the quantity, of reward, suggesting a surprisingly nuanced neural implementation of reward prediction errors.

Temporal Specificity of Reward Prediction Errors Signaled by Putative Dopamine Neurons in Rat VTA Depends on Ventral Striatum

Dopamine neurons signal reward prediction errors. This requires accurate reward predictions. It has been suggested that the ventral striatum provides these predictions. Here we tested this hypothesis by recording from putative dopamine neurons in the VTA of rats performing a task in which prediction errors were induced by shifting reward timing or number. In controls, the neurons exhibited erro...

The emotive brain, the noradrenergic system, and cognition

Motivation and attention can have a profound influence on perception, learning and memory. Neuromodulatory systems, especially the noradrenergic (NE) system, co-vary with psychological states to modulate cortical arousal, influence sensory processing and promote synaptic plasticity. There is even some suggestion that the NE system might facilitate functional recovery after brain damage. Post-sy...

Dopamine Ramps Are a Consequence of Reward Prediction Errors

Temporal difference learning models of dopamine assert that phasic levels of dopamine encode a reward prediction error. However, this hypothesis has been challenged by recent observations of gradually ramping striatal dopamine levels as a goal is approached. This note describes conditions under which temporal difference learning models predict dopamine ramping. The key idea is representational: ...

Journal:

Neural Computation

Volume 20, Issue 12

Pages: -

Publication date: 2008
